Efficient Processing of the Cube Operator

Author

  • Martin Zirkel

Abstract

This paper presents part of the doctoral work on the theme “The impact of sorted reading from UB-trees on relational database systems”. Building on [Mar99], this doctoral work deals with the efficient implementation and analysis of sorted reading (UB-Cache [Bay97b], Tetris algorithm [MB98], [MZB99]) and its effect on query processing and query optimization in relational databases, especially in the fields of data warehousing (DW) and data mining. Today, the B-tree [BM72] is a de facto standard in relational database systems for one-dimensional data. Based on the B-tree, a new data structure for multidimensional data, the UB-tree, was introduced in [Bay96]. This data structure uses a space-filling curve to partition a multidimensional universe into disjoint regions while preserving multidimensional clustering. A space-filling curve (e.g., the Z-curve) maps a d-dimensional universe to a one-dimensional universe. Point queries, insertion, and multidimensional range queries can therefore be handled efficiently by an ordinary B*-tree, which is available in any commercial relational database system. In our prototype implementation we use the Z-curve. A Z-address Z(x) is the ordinal number of a tuple x on the Z-curve and is used as a key for the B*-tree [Mar99].
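The mapping from a d-dimensional tuple to its Z-address can be illustrated with bit interleaving, the usual way to compute a position on the Z-curve. The following is a minimal sketch, not the prototype's code: the function names `z_address` and `z_decode`, the fixed bit width per dimension, and the choice of the first dimension as the least significant bit are all assumptions made for illustration.

```python
def z_address(point, bits_per_dim):
    """Interleave the bits of each coordinate into a single integer.

    Assumption: coordinates are non-negative integers below 2**bits_per_dim,
    and dimension 0 supplies the least significant bit of each bit group.
    """
    dims = len(point)
    limit = 1 << bits_per_dim
    address = 0
    for dim, coord in enumerate(point):
        if not 0 <= coord < limit:
            raise ValueError(f"coordinate {coord} outside [0, {limit})")
        for bit in range(bits_per_dim):
            # Bit `bit` of dimension `dim` lands at position bit * dims + dim.
            address |= ((coord >> bit) & 1) << (bit * dims + dim)
    return address


def z_decode(address, dims, bits_per_dim):
    """Inverse of z_address: recover the tuple from its Z-address."""
    coords = [0] * dims
    for bit in range(bits_per_dim):
        for dim in range(dims):
            coords[dim] |= ((address >> (bit * dims + dim)) & 1) << bit
    return tuple(coords)


# Sorting a 4 x 4 grid by Z-address yields the order in which a
# one-dimensional index keyed on Z(x) would store the tuples.
grid = [(x, y) for y in range(4) for x in range(4)]
z_order = sorted(grid, key=lambda p: z_address(p, 2))
```

Because the Z-address is an ordinary integer, it can serve directly as the key of a one-dimensional index, while tuples that are close in all dimensions tend to receive nearby keys. The sketch covers only the address computation; region splitting and range-query processing of the UB-tree are not shown.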




Publication date: 2000